Mining Dense Patterns from Off Diagonal Protein Contact Maps

Authors

  • M. Om Swaroopa
  • K. Suvarna Vani
Abstract

Proteins are biochemical compounds consisting of one or more polypeptides that carry out a biological function. The three-dimensional structure of a protein enables it to perform its biophysical and biochemical functions in the cell. Protein contact maps are 2D representations of the contacts among the amino acid residues in the folded protein structure. Many researchers have noted that secondary structures are clearly visible in contact maps, where helices appear as thick bands and sheets as bands orthogonal to the diagonal. In this paper, we explore several machine learning algorithms for the data-driven construction of classifiers that assign off-diagonal protein contact maps to folds. A simple and computationally inexpensive algorithm based on the triangle subdivision method is implemented to extract twenty features from off-diagonal contact maps. This method successfully characterizes the off-diagonal interactions in the contact map for predicting specific folds. The NaiveBayes, J48 and REPTree classifiers achieve recall of 76.38%, 91.66% and 80.32%, respectively.
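
The abstract does not spell out the triangle subdivision scheme, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes the off-diagonal region is the upper triangle of the map at least `min_sep` residues away from the diagonal, that each triangle is split into four by joining edge midpoints, and that the feature for each sub-triangle is its contact density. Under these assumptions, two levels of subdivision give 4 + 16 = 20 values. The names `triangle_mask`, `subdivide`, `off_diagonal_features` and the parameter `min_sep` are hypothetical.

```python
import numpy as np

def triangle_mask(n, tri):
    """Boolean n x n mask of grid cells (i, j) inside or on the triangle tri."""
    (ax, ay), (bx, by), (cx, cy) = tri
    i, j = np.mgrid[0:n, 0:n].astype(float)

    def edge(px, py, qx, qy):
        # Sign of the cross product (q - p) x (cell - p) for every grid cell.
        return (qx - px) * (j - py) - (qy - py) * (i - px)

    d1 = edge(ax, ay, bx, by)
    d2 = edge(bx, by, cx, cy)
    d3 = edge(cx, cy, ax, ay)
    has_neg = (d1 < 0) | (d2 < 0) | (d3 < 0)
    has_pos = (d1 > 0) | (d2 > 0) | (d3 > 0)
    # A cell is inside when the three signs never disagree.
    return ~(has_neg & has_pos)

def subdivide(tri):
    """Split a triangle into four by joining the midpoints of its edges."""
    a, b, c = tri
    mid = lambda p, q: ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
    ab, bc, ca = mid(a, b), mid(b, c), mid(c, a)
    return [(a, ab, ca), (ab, b, bc), (ca, bc, c), (ab, bc, ca)]

def off_diagonal_features(cmap, min_sep=4, levels=2):
    """Contact density in each sub-triangle of the off-diagonal region.

    Assumption: features are collected at every subdivision level,
    so levels=2 yields 4 + 16 = 20 values.
    """
    n = cmap.shape[0]
    i, j = np.indices((n, n))
    # Keep only contacts at least min_sep residues away from the diagonal.
    off = cmap.astype(bool) & (j - i >= min_sep)
    root = ((0, min_sep), (0, n - 1), (n - 1 - min_sep, n - 1))
    current, feats = [root], []
    for _ in range(levels):
        current = [t for tri in current for t in subdivide(tri)]
        for tri in current:
            mask = triangle_mask(n, tri)
            feats.append(off[mask].sum() / max(mask.sum(), 1))
    return np.array(feats)
```

The resulting vector could then be passed to any off-the-shelf classifier. Cells lying exactly on a shared edge are counted in both neighbouring triangles, which is harmless for densities but would need care if raw counts were used instead.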

Similar resources

Mining Protein Contact Maps

The 3D conformation of a protein may be compactly represented in a symmetrical, square, boolean matrix of pairwise, inter-residue contacts, or “contact map”. The contact map provides a host of useful information about the protein’s structure. In this paper we describe how data mining can be used to extract valuable information from contact maps. For example, clusters of contacts represent certa...
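
As an illustration of the symmetric boolean matrix described above, here is a minimal sketch that builds a contact map from residue coordinates. It assumes one 3D point per residue (for example the C-alpha atom) and a distance cutoff of 8 angstroms. Both are common conventions but are assumptions here, since the excerpt does not state how contacts are defined. The function name `contact_map` is hypothetical.

```python
import numpy as np

def contact_map(coords, cutoff=8.0):
    """Return a symmetric boolean matrix of pairwise residue contacts.

    coords: array-like of shape (n_residues, 3), one point per residue.
    cutoff: distance threshold (assumed 8.0 angstroms, not from the source).
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise difference vectors via broadcasting: shape (n, n, 3).
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return dist <= cutoff

# Three residues on a line, 5 angstroms apart: neighbours are in
# contact, the two end residues (10 angstroms apart) are not.
example = contact_map([[0, 0, 0], [5, 0, 0], [10, 0, 0]])
```

Because the distance matrix is symmetric, so is the map, and every residue is trivially in contact with itself on the diagonal.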

Mining of protein contact maps for protein fold prediction

Contact maps have been used in ab initio methods for the protein structure prediction problem. Secondary structures and contacts made by the residues are clearly visible in the contact maps, where helices appear as thick bands and beta sheets as bands orthogonal to the diagonal. This paper explores the idea of extracting rules from contact maps to represent “protein fold” inf...

LGM: Mining Frequent Subgraphs from Linear Graphs

A linear graph is a graph whose vertices are totally ordered. Biological and linguistic sequences with interactions among symbols are naturally represented as linear graphs. Examples include protein contact maps, RNA secondary structures and predicate-argument structures. Our algorithm, linear graph miner (LGM), leverages the vertex order for efficient enumeration of frequent subgraphs. Based o...

A Fast Block Low-Rank Dense Solver with Applications to Finite-Element Matrices

This article presents a fast dense solver for hierarchically off-diagonal low-rank (HODLR) matrices. This solver uses algebraic techniques such as the adaptive cross approximation (ACA) algorithm to construct the low-rank approximation of the off-diagonal matrix blocks. This allows us to apply the solver to any dense matrix that has an off-diagonal low-rank structure without any pr...

Mining Attribute-structure Correlated Patterns in Large Attributed Graphs

In this work, we study the correlation between attribute sets and the occurrence of dense subgraphs in large attributed graphs, a task we call structural correlation pattern mining. A structural correlation pattern is a dense subgraph induced by a particular attribute set. Existing methods are not able to extract relevant knowledge regarding how vertex attributes interact with dense subgraphs. ...

Publication date: 2012